Creating objects, using functions, and more!
Dr William Kay
October 28, 2025
Here’s what you should know and be able to do after completing this practical:
LO1: Use R like a calculator (addition, subtraction, division, multiplication)
LO2: Perform some basic functions (like the square root function)
LO3: Create simple objects
LO4: Create sequences and repeats of numbers
LO5: Apply basic functions on objects
LO6: Calculate some basic descriptive statistics
LO7: Install and load packages in R
LO8: Cite R and R packages
You should copy the code in this tutorial and paste it into your script in R. Then click on the line of code and click “Run” in the top-right of your script pane (or press Ctrl+R or Ctrl+Enter).
R can be used just like a calculator
NOTE: The usual rules of precedence apply (multiplication happens before addition):
You can use parentheses to make sure that specific code is run first:
NOTE: You cannot use curly {} or square [] brackets for this (they have other functions)
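Here are a few lines you could run to try this out. The particular numbers are just illustrations chosen for this sketch:

```r
3 + 5          # addition
10 - 4         # subtraction
6 * 7          # multiplication
20 / 4         # division

2 + 3 * 4      # multiplication happens first, so this is 2 + 12
(2 + 3) * 4    # parentheses force the addition to happen first
```

The last two lines use the same numbers but give different answers, which is why the position of the parentheses matters.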
R can be used to perform power transformations
For example, you can raise 10 to the power of 4 to get 10,000:
Raising 10 to the power of 4 is the same as doing 10 × 10 × 10 × 10
If you raise something to the power of 0.5, that is the same as taking the square root.
Remember that a square root is just “the number that, when multiplied by itself, gives the original number.”
For example, the square root of 100 is 10, because when 10 is multiplied by itself you get 100:
NOTE: Raising a number to the power of 0.5 does the same thing:
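A short sketch of the power and square root examples described here, using the `^` operator and the base R `sqrt()` function:

```r
10^4                # 10 raised to the power of 4
10 * 10 * 10 * 10   # the same calculation written out in full

sqrt(100)           # square root using the sqrt() function
100^0.5             # raising to the power of 0.5 gives the same result
```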
The sqrt() above is the first example of a function in R (more on this later)
R also understands fractions:
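For instance, one third can be written as a division (this is likely the kind of example intended, although the exact fraction is an assumption):

```r
1 / 3   # R prints a rounded version of the result
```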
Strictly speaking the above is 0.333 recurring
R has lots of useful things built into it:
The 26 upper-case letters of the Roman alphabet:
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
The 26 lower-case letters of the Roman alphabet:
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
The three-letter abbreviations for the English month names:
The English names for the months of the year:
[1] "January" "February" "March" "April" "May" "June"
[7] "July" "August" "September" "October" "November" "December"
The ratio of the circumference of a circle to its diameter:
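The descriptions above match the following constants that are built into base R. Typing the name of each one and running it prints its contents:

```r
LETTERS      # the 26 upper-case letters
letters      # the 26 lower-case letters
month.abb    # three-letter abbreviations of the month names
month.name   # full English month names
pi           # ratio of a circle's circumference to its diameter
```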
R can do trigonometry!
Don’t worry about the mathematics - I just want you to know what’s possible:
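As a small illustration (the angles are chosen for this sketch), here are the three basic trigonometric functions. Note that R measures angles in radians, not degrees:

```r
sin(pi / 2)   # sine of 90 degrees, written in radians
cos(0)        # cosine of 0
tan(pi / 4)   # tangent of 45 degrees, written in radians
```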
R understands infinity (and minus infinity):
R also understands things that aren’t numbers.
For example if you divide zero by zero, you get “NaN” (“Not a Number”)
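You can see all three of these special values by dividing by zero:

```r
1 / 0    # Inf (infinity)
-1 / 0   # -Inf (minus infinity)
0 / 0    # NaN (Not a Number)
```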
Creating objects is probably the most important thing you’ll learn in R
Objects can be anything you want them to be. Numbers, words, datasets, pictures, and more!
We can create “objects” by using the “assignment operator”: <-
For example, create a new object called “data” which contains the numbers 1 to 5:
This “data” object will now appear in your Environment (top-right)
Now, if we simply type the name of that object and run it, we will see what it contains:
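A minimal sketch of these two steps, using the colon operator to generate the numbers 1 to 5:

```r
data <- 1:5   # create the object; 1:5 means the whole numbers from 1 to 5
data          # type the object's name to see what it contains
```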
We can also create objects that contain words.
For example, here’s a motivational phrase:
Again, we can type the name of that object and we’ll see that phrase:
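Something like the following would work. The object name `phrase` and the wording of the phrase are made up for this sketch. Note that words must be wrapped in quotation marks:

```r
phrase <- "Practice makes progress"   # hypothetical object name and text
phrase                                # prints the stored phrase
```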
This might seem a little abstract now, but soon you will be creating objects that consist of data that we plot and perform statistical analysis on!
Now it’s your turn to have a go:
Write the code to create an object called “new” and put the numbers 1 to 10 in that object.
Just as important as creating objects is being able to use functions
Functions are specific blocks of code that perform specific tasks
For example, let’s look at the sum() function, which simply sums things together:
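For instance, with some numbers picked for illustration:

```r
sum(1, 2, 3, 4)   # adds the four numbers together
```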
A very useful function in R is “concatenate”. This is written as c()
This function groups things together (that’s what “concatenate” means!)
Here’s an example:
Let’s say we wanted to group some numbers together
NOTE: You can’t run just a list of numbers. You will get an error message:
You also cannot do this even if the numbers are inside parentheses.
You will still get an error:
We can only do this if we use the concatenate function: c()
And you have to put the numbers inside parentheses, with each number separated by a comma
Group together some even numbers:
We could store these numbers inside an object:
We can now perform mathematical calculations on these numbers:
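The steps above could look like this. The object name `evens` and the specific numbers are assumptions made for this sketch. The two lines that cause errors are commented out so the rest of the script still runs:

```r
# 2, 4, 6, 8      # a bare list of numbers gives an error
# (2, 4, 6, 8)    # parentheses alone do not help either

c(2, 4, 6, 8)     # c() groups the numbers together

evens <- c(2, 4, 6, 8)   # store the grouped numbers in an object
evens * 2                # multiply every number by 2
sum(evens)               # add all of the numbers together
```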
You can also group words together and store these in an object:
But of course you can’t perform mathematical calculations on words!
The following will give you an error:
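Here is a sketch with a made-up object called `fruits`. The calculation is wrapped in `try()`, which is an addition for this example: it shows the error message without stopping the rest of your script.

```r
fruits <- c("apple", "banana", "cherry")   # hypothetical object of words
fruits

try(fruits + 1)   # error: you cannot add 1 to words
```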
Now it’s your turn.
Write your own code to create an object called “names” and store inside that object your own name and the name of the person sitting next to you.
What you’ve just had a go at above is creating an object and then doing something with it. This is actually one of the most common things you’ll be doing in R. So, let’s have some more practice.
Create an object (called “obj”), which contains 4 numbers:
Now we can do things like add 5 to all of the numbers in that object:
Or multiply them all by 10:
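For example, assuming the four numbers are 1, 2, 3 and 4 (any four numbers would do):

```r
obj <- c(1, 2, 3, 4)   # example values chosen for this sketch

obj + 5    # adds 5 to every number in obj
obj * 10   # multiplies every number in obj by 10
```

Notice that `obj` itself does not change. To keep a result, you would need to store it in an object.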
Let’s create a different object, called “x1”.
x1 will contain the letter A and some words:
NOTE: R is “case-sensitive”
This means that you must be careful to use lower-case or upper-case correctly
This ALWAYS catches beginners out!
For example, now that you have created the “x1” object, if you type “X1”, it won’t work!
This tells you object ‘X1’ not found, because you didn’t create X1, you created x1
In exactly the same way, if you tried to run “Obj” that would give you an error too:
Because as far as R knows, nothing called “Obj” exists (only the lower case “obj”)
We can now perform the structure function str() on any object (“str” stands for “Structure”).
The structure function tells us about an object - in this case the x1 object is a “Character vector” (a list of characters i.e., words or symbols):
It also tells us that this object contains 5 elements, indexed from 1 to 5 ([1:5]), and that these elements are “A”, “mean”, “mode”, “variance”, and “statistics”.
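Putting the last few steps together: the sketch below creates `x1` with the five elements listed above, shows the case-sensitivity error, and then runs `str()`. The `try()` wrapper is an addition so that the error does not stop the script.

```r
x1 <- c("A", "mean", "mode", "variance", "statistics")

try(X1)   # error: object 'X1' not found, because R is case-sensitive

str(x1)   # describes the structure of the x1 object
```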
In the “Introduction to R Coding” lecture you saw different types of objects we can create in R. Specifically, you saw Character, Factor, Integer, and Numeric objects.
Let’s create an example of each.
A character object just contains words or symbols:
An integer is an object that contains only whole numbers:
A numeric object contains numbers that can have decimal places:
Next we will create a factor. A factor is an object that can contain both words and numbers. However, the important thing to remember is that a factor object has categories.
So, we could create category1, category2, etc.
Importantly, if the same thing appears multiple times in a factor, R will understand that this represents multiple observations from the same category.
To create a factor you need to use the function as.factor():
Factor w/ 3 levels "category1","category2",..: 1 1 2 3
Notice how the factor looks different to the character. R detects that there are 3 levels (categories).
A factor recognises that there are specific categorical groups, whereas a character variable just treats each element as completely unique.
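Here is one way the four object types could be created. Only the name `chr` and the factor's categories come from this practical; the names `int`, `num` and `fac` and the values are assumptions. The `L` after each whole number tells R to store it as an integer, because plain numbers such as `1` are stored as numeric by default.

```r
chr <- c("apple", "banana", "cherry")   # character: words or symbols
int <- c(1L, 2L, 3L)                    # integer: whole numbers only
num <- c(1.5, 2.25, 3.75)               # numeric: numbers with decimal places
fac <- as.factor(c("category1", "category1", "category2", "category3"))

str(chr)
str(int)
str(num)
str(fac)   # reports 3 levels, even though there are 4 elements
```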
Now you have a go.
Create a factor called “exp” which contains a “control” and “treatment” group.
R can be used to generate sequences of numbers using the seq() function:
Let’s sequence numbers from 0 to 10 in intervals of 1:
Or sequence from 0 to 1000 in intervals of 50:
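Both sequences can be sketched with the `from`, `to` and `by` arguments of `seq()`:

```r
seq(from = 0, to = 10, by = 1)      # 0, 1, 2, ... up to 10
seq(from = 0, to = 1000, by = 50)   # 0, 50, 100, ... up to 1000
```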
We can use the function rep() to generate repeated things:
This will repeat 4, 10 times:
Or the letter A, 5 times:
Remember the character object we created earlier, called “chr”?
We could apply the rep() function to that object.
For example, ask R to repeat “chr” twice:
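The three examples could be written as follows. The contents of `chr` here are placeholder values so that the sketch runs on its own:

```r
rep(4, 10)    # the number 4, repeated 10 times
rep("A", 5)   # the letter A, repeated 5 times

chr <- c("apple", "banana", "cherry")   # placeholder character object
rep(chr, 2)                             # the whole object, repeated twice
```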
You can ask R for help with a function by putting a question mark in front of the function:
This will bring up a help page that tells you about that function.
These pages can be a bit tricky to understand at first, but with practice you will learn how to interpret them.
If you scroll down to the very bottom of that help page you can see some “Examples” that demonstrate how the function can be used.
For the rep() function, a good way to use it is by stating each = X where X is the number of times you want each number to be repeated.
For example, let’s repeat the numbers 1, 2, 3 each 3 times:
What you have just seen above (the use of “each” inside the rep() function) is an example of an “Argument”.
Arguments are the specific instructions we give to different functions to specify exactly how those functions are to be used.
The rep() function can use the times argument instead of each. This will change how many times the thing that you specify is repeated.
For example, repeat “1, 2, 3” three times:
You should be able to see how this is different to the previous example.
Compare them:
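Side by side, the two arguments behave like this:

```r
rep(c(1, 2, 3), each = 3)    # 1 1 1 2 2 2 3 3 3
rep(c(1, 2, 3), times = 3)   # 1 2 3 1 2 3 1 2 3
```

With `each`, every number is repeated before moving on to the next one. With `times`, the whole group is repeated in order.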
R can very easily and quickly calculate basic descriptive statistics.
Let’s create an object called “numbers” and then calculate some basic statistics:
Use the sum() function to add all the numbers together:
The min() function tells you the minimum number:
max() tells you the maximum number:
range() shows you both the minimum and maximum:
median() shows you the middle number:
mean() calculates the mean value (add them all up and divide by how many there are):
The sd() function calculates the Standard Deviation (a measure of variability):
The var() function calculates the variance (this is another measure of variability):
In fact, the variance is the standard deviation squared:
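The functions above can be run one after another. This sketch uses the five values 1, 5, 10, 100 and 200 for the `numbers` object:

```r
numbers <- c(1, 5, 10, 100, 200)

sum(numbers)      # total of all the numbers
min(numbers)      # smallest number
max(numbers)      # largest number
range(numbers)    # smallest and largest together
median(numbers)   # middle number
mean(numbers)     # sum divided by how many numbers there are
sd(numbers)       # standard deviation
var(numbers)      # variance

sd(numbers)^2     # squaring the standard deviation gives the variance
```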
We can store results of our calculations in new objects.
So here, we apply the mean() function to the numbers object, and store the result of that in a new object called “avg”:
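In code, that could look like this:

```r
numbers <- c(1, 5, 10, 100, 200)

avg <- mean(numbers)   # store the mean in a new object
avg                    # print the stored value
```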
Calculating a mean is something you will probably need to do quite a lot.
It’s important therefore to know that whenever we calculate a mean, what we are actually doing is estimating a value. Because this is an estimation, it will come with some error.
When we calculate a mean we therefore must always also calculate how much error we have in our estimate. This is called the Standard Error of the Mean (SEM).
If we want to calculate the SEM, we can do it like this…
First we can create a function:
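A minimal one-line sketch of such a function, assuming it is called `SEM` as in the rest of this section:

```r
# Standard deviation of x, divided by the square root of the number of observations in x
SEM <- function(x) sd(x) / sqrt(length(x))
```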
NOTE: I’m not expecting you to be able to create your own functions - this is quite advanced!
But for anyone who is interested, all this line of code above does is tell R that we want to create a new function which we can apply to “x”.
“x” in this case represents anything we choose to apply that function to.
Specifically, we want the function to calculate the standard deviation of x. Then, divide that by the square root of how many observations there are in x.
NOTE: The length() function calculates how many elements there are in an object.
Let’s quickly see length() in action in a simple example:
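Using the same five values for `numbers`:

```r
numbers <- c(1, 5, 10, 100, 200)

length(numbers)   # how many elements are in the object
```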
In our “numbers” object there are 5 numbers (1, 5, 10, 100 and 200).
So length will give us the answer 5.
So, in summary, the Standard Error of the Mean (SEM) function calculates the standard deviation (the variability) of the data and then divides that by the square root of how many data points there are.
If you want to see that written mathematically, it’s:
\[ \text{SEM} = \frac{s}{\sqrt{n}} \]
We can apply the SEM function now to our numbers object.
NOTE: We apply the SEM function to the raw data, not the estimated mean itself.
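As a self-contained sketch (the one-line definition of `SEM` is repeated here so that the block runs on its own):

```r
numbers <- c(1, 5, 10, 100, 200)
SEM <- function(x) sd(x) / sqrt(length(x))

mean(numbers)   # the estimated mean
SEM(numbers)    # the standard error of that mean, calculated from the raw data
```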
The SEM essentially represents how precise our estimated mean is.
So the mean of “numbers” could be reported as:
63.2 ± 38.82963
The 38.82963 here represents how much error there is around our estimated mean value of 63.2.
Whenever you collect data in science, there’s a chance that you may not be able to collect all of the observations you had planned to.
In those cases, you are likely to have missing observations.
For example, imagine I wanted to measure the heights (cm) of 5 patients, but one of the patients didn’t turn up to the clinic. I would have one missing observation. That could look something like this:
When you have a missing value in your observations, R will not, by default, calculate statistics from those numbers.
For example mean() will not give you a number, but rather NA:
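A sketch of this, assuming the five measurements are stored in an object called `heights` (a name chosen for illustration). Note that `NA` is typed without quotation marks:

```r
heights <- c(165, 168, 174, NA, 170)   # NA marks the missing patient

mean(heights)   # returns NA because one value is missing
```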
This is because you can’t calculate the mean of “165, 168, 174, NA, and 170”.
In order to calculate the mean of the numbers only, you must remove any missing observations; you must remove the NA.
The easiest way to do this is by using the “na.rm” argument.
In this example we set na.rm to “TRUE” to tell R to remove the NAs.
Now it can calculate the number for us:
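Using the same illustrative `heights` object (the name is an assumption):

```r
heights <- c(165, 168, 174, NA, 170)

mean(heights, na.rm = TRUE)   # mean of the four recorded heights
```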
NOTE: This is also required for many of the other basic descriptive statistics calculations you did earlier. Here are a couple more examples:
[1] NA
[1] 3.774917
[1] NA
[1] 165
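The four results above are consistent with running `sd()` and `min()` with and without `na.rm = TRUE`. A sketch, again using the illustrative name `heights`:

```r
heights <- c(165, 168, 174, NA, 170)

sd(heights)                  # NA, because of the missing value
sd(heights, na.rm = TRUE)    # standard deviation of the four recorded heights
min(heights)                 # NA again
min(heights, na.rm = TRUE)   # smallest recorded height
```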
So, in summary, if you have missing observations in your data, make sure to use na.rm = TRUE.
R can be very helpful in rounding numbers for us.
This is easily done using the round() function.
This is very important, not least because in some of your assignments you will be asked to round numbers to an “appropriate number of decimal places”.
Here’s an example of how we can round a number in R to a specific number of decimal places.
We simply use the round() function:
The “1” in the above tells R to round the number to 1 decimal place.
If you needed to, for example, measure something to a precision of 0.01μg/mL, this would represent 2 decimal places:
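For example, with a number chosen for illustration, the second value inside `round()` sets the number of decimal places:

```r
round(12.3456, 1)   # rounds to 1 decimal place
round(12.3456, 2)   # rounds to 2 decimal places
```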
We generated specific sequences earlier, and repeated numbers. But R can also generate (simulate) random data. You will come to learn that this is key for statistics.
We can use the rnorm() function to generate data from a normal distribution.
The rnorm() function has three arguments: n, mean, and sd
n = the number of observations you want to simulate
mean = this is the mean value of the normal distribution you want to simulate from
sd = this is the standard deviation of the distribution
Let’s simulate 30 numbers from a distribution with a mean of 5, and a SD of 1:
[1] 6.167669 5.057183 5.941927 5.546070 3.767478 2.886207 3.724482 4.974249
[9] 3.903738 5.550198 6.614280 4.802678 4.443351 6.430941 6.343992 4.115385
[17] 3.761996 4.029446 5.686474 5.694078 5.012846 4.932377 4.364064 3.610650
[25] 5.857590 5.487381 5.785088 3.239031 4.847095 5.534359
Remember, this is a random number generator, so these 30 numbers may not have an exact mean of 5 and a standard deviation of 1. Let’s have a look:
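A sketch of the simulation and the check, assuming the object is called `rNum` (the name is a guess). Your numbers will differ from those printed above, and from run to run, because they are random:

```r
rNum <- rnorm(n = 30, mean = 5, sd = 1)   # simulate 30 random numbers

mean(rNum)   # close to 5, but rarely exactly 5
sd(rNum)     # close to 1, but rarely exactly 1
```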
Now you have a go.
Create an object called “rNum2” which simulates 100 numbers from a distribution with a mean of 5 and an sd of 1. Then, calculate the mean and sd of that object.
When you first install R on your device, or use R online for the first time, it will contain lots of different functions that you can use. However, not everything is automatically installed.
If you want to install new things into R, you can! This is called installing packages.
R packages are banks of functions.
To install a new package you need to use the install.packages() function, and make sure to write the name of the package in quotation marks.
For example, install the “maps” package:
Installing package into 'C:/Users/sbiwk/AppData/Local/R/win-library/4.5'
(as 'lib' is unspecified)
package 'maps' successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\sbiwk\AppData\Local\Temp\Rtmp8ClIPq\downloaded_packages
Then you must always run the library() function to open that package:
Now we can use the map() function (which comes from the “maps” package):
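The install, load and use steps could be sketched as follows. The `requireNamespace()` check is an addition for this example, so that the code does not stop with an error if the package has not been installed yet. The install line is commented out because it only needs to be run once.

```r
# install.packages("maps")   # run once per device; needs an internet connection

# Check whether the package is available before trying to load it
maps_available <- requireNamespace("maps", quietly = TRUE)

if (maps_available) {
  library(maps)   # load the package for this session
  map("world")    # draw a world map using the package's built-in data
}
```

In your own script you would normally just run `library(maps)` followed by `map("world")`.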
NOTE: We will explore more data visualisations like this one in the next script.
Whenever we use R in a report or assignment we should cite it.
To get the R citation we just run citation()
If we want to cite a specific package we can do it like this:
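Both forms are shown below. The "stats" package is used here only because it comes with every R installation. Swap in the name of any installed package, such as "maps".

```r
citation()          # how to cite R itself
citation("stats")   # how to cite a specific package
```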
Write your own code to complete the following 6 tasks. NOTE: You will not be assessed on these tasks. They are just designed to encourage you to practice.
NOTE: You aren’t expected to remember how to do all of these yet!
HINT: Look back at earlier sections of your script to find the bits of code that you need, and adapt them (that’s the best way to work in R)
Create an object called “seq1” which consists of a sequence of numbers from 5 to 100, in intervals of 1
Calculate the middle value of seq1
What is the total value of this sequence of numbers?
Install and load the “car” package
Copy and paste here the citation for the “car” package:
Finally, create a random set of ten-thousand numbers from a normal distribution with a mean of 5 and a standard deviation of 2